In recent years, amid growing concern over environmental issues, the move toward paperless offices has been promoted, and various techniques for handling digital documents have been proposed.
For example, Japanese Patent Laid-Open No. 2001-358863 describes a technique for scanning a paper document with a scanner, converting the scanned data into a digital document format (e.g., JPEG or PDF), and storing the converted data in image storage means.
On the other hand, Japanese Patent Laid-Open No. 8-147445 discloses a technique for detecting regions of different properties contained in a document image, and managing the document as contents on a per-region basis.
Also, Japanese Patent Laid-Open No. 10-063820 discloses a technique for identifying corresponding digital information on the basis of a scanned input image, and further describes an information processing apparatus which extracts a difference between the input image and the digital information, and composites the extracted difference onto the identified digital information.
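The difference extraction and compositing described for Japanese Patent Laid-Open No. 10-063820 can be illustrated with a minimal sketch. It is not the method of that publication: it assumes the scanned image and a rendering of the identified digital information are already aligned, equal in size, and represented as grayscale pixel rows. The function names `extract_difference` and `composite` and the `threshold` parameter are hypothetical.

```python
def extract_difference(scanned, original, threshold=32):
    """Collect scanned pixels that differ noticeably from the original.

    Both images are lists of rows of grayscale values (0-255), assumed
    to be aligned and of equal size. `threshold` is an assumed tolerance
    for scanner noise. Returns a sparse dict {(row, col): scanned_value}.
    """
    diff = {}
    for r, (scan_row, orig_row) in enumerate(zip(scanned, original)):
        for c, (s, o) in enumerate(zip(scan_row, orig_row)):
            if abs(s - o) > threshold:
                diff[(r, c)] = s
    return diff


def composite(original, diff):
    """Overlay the extracted difference onto a copy of the original."""
    result = [row[:] for row in original]
    for (r, c), value in diff.items():
        result[r][c] = value
    return result


# A 2x2 example: one pixel carries handwriting, one carries scanner noise.
original = [[255, 255], [255, 255]]
scanned = [[255, 0], [250, 255]]
diff = extract_difference(scanned, original)
merged = composite(original, diff)
```

In this sketch only the pixel that changed by more than the tolerance is kept in the difference, so the small deviation attributed to scanner noise does not reach the composited result.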
Furthermore, Japanese Patent Laid-Open No. 10-285378 discloses the following technique. In a digital multi-function peripheral (MFP) having a copy function, scan function, print function, and the like, it is checked whether a scanned image includes a graphic code indicating a page ID, and if the graphic code is found, a database is searched for the corresponding page ID. If the page ID is found in the database, the currently scanned image is discarded, the print data associated with that page ID is read out, and a print image is generated and printed on a paper sheet by a print operation. If no corresponding page ID is found in the database, either the scanned image is copied directly onto a paper sheet in a copy mode, or a PDL command is appended to the scanned image to convert it into a PDL format and the converted data is transmitted in a facsimile or filing mode. Japanese Patent Laid-Open No. 10-285378 also discloses the following technique. A function-extended recording device and an MFP are used, and original data files of text data, images, and the like are stored in an image storage device. When a paper document is recorded by printing an original data file, pointer information indicating where in the image storage device the original data file is stored is recorded as additional information on a cover sheet of the paper document or in the print information. With this technique, the original data file can be accessed directly based on the pointer information and re-used (e.g., edited or printed), thus reducing the quantity of paper documents to be held.
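The page ID lookup flow described for Japanese Patent Laid-Open No. 10-285378 can be sketched roughly as follows. This is only an illustration under stated assumptions: the database is modeled as a dictionary, the names `ScanResult` and `handle_scan`, the mode strings, and the `%PDL-WRAPPER` prefix are hypothetical, and a scan with no graphic code at all is assumed to be handled like a scan whose page ID is not registered.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScanResult:
    image: bytes             # raw scanned image data
    page_id: Optional[str]   # page ID decoded from a graphic code, or None


def handle_scan(scan, database, mode="copy"):
    """Decide what to do with a scanned page.

    `database` maps page IDs to stored print data (an assumption).
    Returns an (action, payload) pair.
    """
    # Page ID found: discard the scan and print the stored print data.
    if scan.page_id is not None and scan.page_id in database:
        return ("print_original", database[scan.page_id])

    # No matching page ID: copy the scanned image as-is in copy mode.
    if mode == "copy":
        return ("copy_scan", scan.image)

    # Facsimile or filing mode: wrap the scan with a placeholder PDL command.
    return ("send_as_pdl", b"%PDL-WRAPPER\n" + scan.image)


database = {"page-001": b"ORIGINAL PRINT DATA"}
known = handle_scan(ScanResult(b"scan", "page-001"), database)
unknown = handle_scan(ScanResult(b"scan", "page-999"), database, mode="copy")
filed = handle_scan(ScanResult(b"scan", None), database, mode="filing")
```

The last branch shows why the file size grows in the fallback case: the scanned image is carried unchanged inside the PDL wrapper, so nothing is gained over the raw scan.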
However, although the technique disclosed in Japanese Patent Laid-Open No. 2001-358863 can save an image scanned by the scanner as a JPEG file or PDF file with a compact information size, it cannot search for a saved file based on the printed document. Hence, when print and scan processes are repeated, the saved digital document image gradually deteriorates, and it is difficult to re-use the saved document. Furthermore, other processes cannot be performed during conversion into a PDF file.
On the other hand, the technique disclosed in Japanese Patent Laid-Open No. 8-147445 divides an image into a plurality of regions and allows these regions to be re-used as individual contents. However, the contents are searched on the basis of a user's instruction, and the contents to be used are determined from the found contents. Hence, when generating a document using the stored contents, the user must decide which contents to use, which is troublesome.
On the other hand, since the technique in Japanese Patent Laid-Open No. 10-063820 extracts difference information by searching for the original digital document corresponding to the output paper document, information additionally written on the paper document can be held as a difference image. However, since the difference information is handled directly as scanned image data, a large storage capacity is required.
In the technique in Japanese Patent Laid-Open No. 10-285378, if no original digital document corresponding to a paper document is found, a PDL command is appended to the scanned image to convert that image into a PDL format. However, when the PDL command is merely appended to the scanned image, the resulting file is relatively large. Also, for a document that has no pointer information to an original data file, the original data file cannot be searched for.